An Empirical Study of Hierarchical Dirichlet Process Priors for Grammar Induction
Abstract
In probabilistic grammar induction, simplicity priors that favor smaller grammars are often used to avoid overfitting. A classic example is Solomonoff's universal probability distribution, $P(G) \propto 2^{-\ell(G)}$, where $\ell(G)$ is the description length of the grammar $G$. The hierarchical Dirichlet process (HDP) [Teh et al., 2006] has recently been used as a prior for the transition probabilities of a probabilistic grammar [Teh et al., 2006; Liang et al., 2007; Finkel et al., 2007]:
● It is a kind of nonparametric Bayesian model.
● It can be incorporated naturally into the graphical model of the grammar, so many sophisticated inference algorithms can be used for grammar induction.
We want to find out:
● how the HDP prior probability of a grammar changes with the description length of the grammar, compared with the universal probability distribution (see the sketch below);
● how the parameters of the HDP affect its behavior.
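As a point of reference, the sketch below (Python; all function names are illustrative, not from the paper) contrasts the two priors under discussion: the Solomonoff-style simplicity prior, which scores a grammar purely by its description length, and a symmetric Dirichlet density over one symbol's rule probabilities, the finite-dimensional analogue of each level of an HDP prior.

```python
import numpy as np
from scipy.special import gammaln

def log_universal_prior(desc_len_bits):
    # Solomonoff-style simplicity prior: P(G) proportional to 2^{-ell(G)}.
    return -desc_len_bits * np.log(2.0)

def log_symmetric_dirichlet(p, alpha):
    # Log-density of one symbol's rule probabilities p under a symmetric
    # Dirichlet(alpha, ..., alpha), the finite-dimensional analogue of the
    # per-level Dirichlet process inside an HDP prior.
    k = len(p)
    return gammaln(k * alpha) - k * gammaln(alpha) + (alpha - 1.0) * np.log(p).sum()

# How the density at a uniform rule distribution shifts as a symbol
# acquires more rules (a crude proxy for grammar size):
for k in (2, 4, 8, 16):
    p = np.full(k, 1.0 / k)
    print(k, log_symmetric_dirichlet(p, alpha=0.5))
```

Since Dirichlet densities live on simplices whose dimension grows with the number of rules, relating them to a single description-length scale is exactly the nontrivial comparison the abstract raises.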
Similar Resources
Covariance in Unsupervised Learning of Probabilistic Grammars
Probabilistic grammars offer great flexibility in modeling discrete sequential data like natural language text. Their symbolic component is amenable to inspection by humans, while their probabilistic component helps resolve ambiguity. They also permit the use of well-understood, general-purpose learning algorithms. There has been an increased interest in using probabilistic grammars in the Bayes...
Probabilistic Grammars and Hierarchical Dirichlet Processes
Probabilistic context-free grammars (PCFGs) have played an important role in the modeling of syntax in natural language processing and other applications, but choosing the proper model complexity is often difficult. We present a nonparametric Bayesian generalization of the PCFG based on the hierarchical Dirichlet process (HDP). In our HDP-PCFG model, the effective complexity of the grammar can ...
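A hedged sketch of the mechanism this snippet describes, assuming a truncated stick-breaking approximation (the truncation level and names are illustrative, not from the paper): top-level weights over child symbols are drawn from a GEM process, and each parent symbol's rule distribution is then a Dirichlet draw centered on those shared weights, which is what lets the effective number of symbols adapt to the data.

```python
import numpy as np

def hdp_pcfg_rule_probs(gamma, alpha, n_parents, truncation, rng):
    # Top-level weights over child symbols, beta ~ GEM(gamma),
    # via truncated stick-breaking.
    v = rng.beta(1.0, gamma, size=truncation)
    beta = v * np.cumprod(np.concatenate(([1.0], 1.0 - v[:-1])))
    # Each parent symbol's rule distribution is a Dirichlet draw
    # centered on the shared weights beta (tail weights can be tiny;
    # that is fine for a sketch).
    return rng.dirichlet(alpha * beta, size=n_parents)

rng = np.random.default_rng(0)
phi = hdp_pcfg_rule_probs(gamma=1.0, alpha=10.0, n_parents=3, truncation=20, rng=rng)
print(phi.shape)  # (3, 20): one rule distribution per parent symbol
```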
Spike train entropy-rate estimation using hierarchical Dirichlet process priors
Entropy rate quantifies the amount of disorder in a stochastic process. For spiking neurons, the entropy rate places an upper bound on the rate at which the spike train can convey stimulus information, and a large literature has focused on the problem of estimating entropy rate from spike train data. Here we present Bayes least squares and empirical Bayesian entropy rate estimators for binary s...
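For orientation only, here is a naive plug-in entropy-rate estimate from length-k words of a binary spike train; this is the simple baseline that Bayesian estimators of this kind aim to improve on, not the paper's method (the function name is hypothetical):

```python
import numpy as np

def plugin_entropy_rate(spikes, k):
    # Plug-in (maximum-likelihood) entropy-rate estimate, in bits per bin,
    # from the empirical distribution of length-k binary words.
    words = np.lib.stride_tricks.sliding_window_view(spikes, k)
    _, counts = np.unique(words, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum() / k

rng = np.random.default_rng(0)
spikes = rng.binomial(1, 0.1, size=100_000)  # synthetic Bernoulli spike train
print(plugin_entropy_rate(spikes, k=8))      # near H(0.1), about 0.469 bits/bin
```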
Computing with Priors That Support Identifiable Semiparametric Models
When used as one component in the prior for a semiparametric hierarchical model, the Dirichlet process may not support an identifiable sampling model. In this note, two variations of the Dirichlet process are studied: a median-zero process defined by Doss and a process having fixed location and scale. Polya sequence theory is described for both, with the focus being on tools for Gibbs sampling. In ...
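The Polya-sequence view mentioned in this snippet has a compact generative form, the Chinese restaurant process; a minimal sketch follows, with a hypothetical interface:

```python
import numpy as np

def crp_assignments(n, alpha, rng):
    # Seat n customers by the Chinese restaurant process with concentration
    # alpha: customer i joins an existing table with probability proportional
    # to its occupancy, or opens a new table with probability proportional
    # to alpha.
    tables = []       # occupancy count per table
    assignments = []
    for i in range(n):
        probs = np.array(tables + [alpha], dtype=float) / (i + alpha)
        z = rng.choice(len(probs), p=probs)
        if z == len(tables):
            tables.append(1)   # open a new table
        else:
            tables[z] += 1
        assignments.append(z)
    return assignments

rng = np.random.default_rng(0)
print(crp_assignments(10, alpha=1.0, rng=rng))
```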
Gibbs Sampling Methods for Stick-Breaking Priors
A rich and flexible class of random probability measures, which we call stick-breaking priors, can be constructed using a sequence of independent beta random variables. Examples of random measures that have this characterization include the Dirichlet process, its two-parameter extension, the two-parameter Poisson–Dirichlet process, finite-dimensional Dirichlet priors, and beta two-parameter pro...
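That construction is compact enough to sketch directly, truncated to finitely many atoms (the names are illustrative):

```python
import numpy as np

def stick_breaking(alpha, n_atoms, rng):
    # v_k ~ Beta(1, alpha); w_k = v_k * prod_{j<k} (1 - v_j).
    # Truncated to n_atoms, so the weights sum to slightly less than 1.
    v = rng.beta(1.0, alpha, size=n_atoms)
    return v * np.cumprod(np.concatenate(([1.0], 1.0 - v[:-1])))

rng = np.random.default_rng(0)
w = stick_breaking(alpha=2.0, n_atoms=50, rng=rng)
print(w[:5], w.sum())  # a few leading weights; the total is just under 1
```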